Abstract

Grebinski and Kucherov (1998) and Alon et al. (2004-2005) study the problem
of learning a hidden graph for some special cases, such as Hamiltonian
cycles, cliques, stars, and matchings. This problem is motivated by problems in
chemical reactions, molecular biology and genome sequencing.

In this paper, we present a generalization of this problem. Precisely, we
consider a graph G and a subgraph H of G, and we assume that G contains exactly
one defective subgraph isomorphic to H. The goal is to find the defective
subgraph with the minimum number of tests, where each test reports whether an
induced subgraph contains an edge of the defective subgraph. We present an upper
bound for the number of tests needed to find the defective subgraph, using the
symmetric and high-probability variations of the Lovász Local Lemma.

Keywords: Group testing on graphs, Non-adaptive algorithm, Combinatorial 
search, Learning hidden subgraph. 



1 Introduction 

In the classic group testing problem, first introduced by Dorfman [9], there
is a set of n items containing at most d defective items. The goal is to find
the defective items with the minimum number of tests. Every test consists of a
subset of the items; a test is positive if it contains at least one defective
item and negative otherwise. There are two types of algorithms for the group
testing problem: sequential and non-adaptive. In a sequential algorithm, the
outcomes of previous tests can be used to design future tests. In a non-adaptive
algorithm, all tests are performed at the same time, and the defective items
must be identified from the results of all tests. Group testing has many
applications, including finding patterns in data and DNA library screening. For
an overview of results and more applications, we refer the reader to
[10, 11, 15].

Aigner [1] proposed the problem of group testing on graphs. In this problem, we
are looking for one defective edge of a given graph G using the minimum number
of sequential tests, where each test is an induced subgraph of G and is
positive if it contains the defective edge.

In this paper, we consider the problem of non-adaptive group testing on graphs.
Suppose that there is a defective subgraph (not necessarily an induced subgraph)
of G isomorphic to a graph H. Each test F is an induced subgraph of G, and the
outcome of the test is positive if and only if F has at least one edge in common
with the defective subgraph. This is a generalization of the problem of learning
a hidden subgraph posed in [12, 2, 3]. More precisely, in the hidden subgraph
learning problem, the graph G is a complete graph; see [12] for some alternative
formulations.

This problem has many applications in chemical reactions, molecular biology and
genome sequencing. In chemical reactions, we have a set of chemicals, some pairs
of which may react. Moreover, before testing we know that some pairs do not
react. When several chemicals are combined in one test, a reaction occurs if and
only if at least one pair of the chemicals in the test reacts. Our goal is to
identify which pairs react using as few tests as possible. We can reformulate
this problem as follows. Suppose that there are n vertices, and two vertices u
and v are adjacent if and only if the chemicals u and v may react. Each pair of
chemicals that actually reacts corresponds to a defective edge, and finding all
such pairs is equivalent to finding the defective subgraph. Since we know in
advance that some pairs do not react, the graph G is not necessarily a complete
graph.

There are various families of hidden graphs to study. Many recent studies focus
on two cases, Hamiltonian cycles and matchings [3, 6, 12], which have specific
applications to genome sequencing and DNA physical mapping. For more information
about these applications and their results, we refer the reader to [5, 7, 12].

2 Notation 

Throughout this paper, we suppose that $H$ is a subgraph of $G$ with $k$ edges.
Moreover, we assume that $G$ contains exactly one defective subgraph isomorphic
to $H$. We denote the maximum degree of $H$ by $\Delta = \Delta(H)$. Also,
$G[X]$ denotes the subgraph of $G$ induced by $X \subseteq V(G)$, and for any
vertex $v \in V(G)$, $N_H(v)$ stands for the set of neighbours of the vertex $v$
in the graph $H$. Hereafter, we assume that the subgraph $H$ has no isolated
vertex, because in the problem of group testing on graphs, vertices are not
defective.

A boolean matrix is said to be $d$-disjunct if for every column $C_0$ and every
choice of $d$ columns $C_1, C_2, \ldots, C_d$ (different from $C_0$), there is at
least one row such that the entry corresponding to $C_0$ is 1 and the entries
corresponding to $C_1, C_2, \ldots, C_d$ are all zeros. This concept was first
introduced in [14].
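
The definition can be checked by brute force on small matrices. The sketch below is only an illustration: the name `is_d_disjunct` and the list-of-rows representation are assumptions, and the exhaustive search over column subsets is practical only for small inputs.

```python
from itertools import combinations


def is_d_disjunct(matrix, d):
    """Check d-disjunctness of a 0/1 matrix given as a non-empty list of rows."""
    n_cols = len(matrix[0])
    for c0 in range(n_cols):
        others = [c for c in range(n_cols) if c != c0]
        for group in combinations(others, d):
            # We need one row with a 1 in column c0 and 0 in every column of the group.
            if not any(row[c0] == 1 and all(row[c] == 0 for c in group)
                       for row in matrix):
                return False
    return True


# Hypothetical examples.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
overlapping = [[1, 1, 0], [0, 1, 1]]  # column 0 is covered by column 1
```

In the second example no row has a 1 in column 0 and a 0 in column 1, so the matrix is not 1-disjunct, whereas the identity matrix is both 1-disjunct and 2-disjunct.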

3 Main result 

For $1 \le l \le t$, let $\mathcal{F}_l$ be a random test obtained by choosing
each vertex of $V(G)$ independently with probability $p$. For simplicity of
notation, we write $F_l$ for the induced subgraph of $G$ on the vertices of
$\mathcal{F}_l$.

Throughout this paper, let $H_1, H_2, \ldots, H_m$ be all the subgraphs of $G$
isomorphic to $H$. Let $C = (c_{lj})$ be a random $t \times m$ matrix such that
for any $l$ and $j$, where $1 \le j \le m$ and $1 \le l \le t$, if
$E(F_l \cap H_j) \ne \emptyset$, then $c_{lj} = 1$; otherwise, $c_{lj} = 0$. The
$l$th row of this matrix corresponds to the test $F_l$ and the $j$th column
corresponds to the subgraph $H_j$. For any $i$, $j$ and $l$, where
$1 \le i \ne j \le m$ and $1 \le l \le t$, define the event $A^l_{i,j}$ to be the
set of all matrices $C$ such that $c_{li} \le c_{lj}$. Also, define the event
$A_{i,j}$ to be the set of all matrices $C$ such that for every $l$,
$1 \le l \le t$, we have $c_{li} \le c_{lj}$. In other words, if the event
$A^l_{i,j}$ occurs, then the test $F_l$ cannot distinguish between $H_i$ and
$H_j$. Also, if the event $A_{i,j}$ occurs, then for every $l$ such that
$1 \le l \le t$, the test $F_l$ cannot distinguish between $H_i$ and $H_j$. We
would like to bound the
probability that none of the bad events $A_{i,j}$ occurs. In such cases, when
there is a relatively small amount of dependence between events, one can use a
powerful generalization of the union bound, known as the Lovász Local Lemma. The
main device in establishing the Lovász Local Lemma is a graph called the
dependency graph. Let $A_1, A_2, \ldots, A_n$ be events in an arbitrary
probability space. A graph $D = (V, E)$ on the set of vertices
$V = \{1, 2, \ldots, n\}$ is a dependency graph for the events
$A_1, A_2, \ldots, A_n$ if for each $1 \le i \le n$ the event $A_i$ is mutually
independent of all the events $\{A_j : \{i, j\} \notin E\}$. We state the Lovász
Local Lemma as follows.

Lemma A. [4] (Lovász Local Lemma, Symmetric Case). Suppose that
$A_1, A_2, \ldots, A_n$ are events in a probability space with $\Pr(A_i) \le p$
for all $i$. If the maximum degree in the dependency graph of these events is
$d$, and if $ep(d + 1) \le 1$, then

$$ \Pr\Big(\bigcap_{i=1}^{n} \overline{A_i}\Big) > 0, $$

where $e$ is the base of the natural logarithm.
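
The hypothesis of Lemma A is a single numerical inequality, so it is easy to test for given parameters. The helper name `symmetric_lll_applies` and the sample numbers below are assumptions chosen for illustration.

```python
import math


def symmetric_lll_applies(p, d):
    """True iff e * p * (d + 1) <= 1, the hypothesis of the symmetric Local Lemma."""
    return math.e * p * (d + 1) <= 1


# Hypothetical numbers: every bad event has probability at most 0.05.
ok_for_degree_6 = symmetric_lll_applies(0.05, 6)  # e * 0.05 * 7 is about 0.95
ok_for_degree_7 = symmetric_lll_applies(0.05, 7)  # e * 0.05 * 8 is about 1.09
```

With these numbers the lemma applies when each event depends on at most 6 others, but the hypothesis fails for 7.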



To find the maximum degree in the dependency graph of the events $A_{i,j}$, we
define the parameter $r_G(H)$ as follows. Let $r_G(H, H_i)$ be the number of
subgraphs isomorphic to $H$ whose intersection with $H_i$ is nonempty, and
define $r_G(H) = \max_i r_G(H, H_i)$.
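
For a small instance, this parameter can be computed directly from the list of copies of $H$. The sketch below makes two assumptions that the text does not state explicitly: "nonempty intersection" is read as sharing at least one vertex, and $H_i$ itself is counted among the copies that intersect $H_i$. The name `r_g` and the input format are hypothetical.

```python
def r_g(copies):
    """copies: list of vertex sets, one for each copy H_1, ..., H_m of H in G.
    Returns max_i of the number of copies sharing at least one vertex with H_i."""
    best = 0
    for h_i in copies:
        # Count the copies (including h_i itself) that meet h_i in a vertex.
        count = sum(1 for h_j in copies if h_i & h_j)
        best = max(best, count)
    return best


# Hypothetical instance: G is the path 0-1-2-3 and H is a single edge.
edge_copies = [{0, 1}, {1, 2}, {2, 3}]
r_value = r_g(edge_copies)  # the middle edge meets all three copies
```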

In the main theorem, we show that the aforementioned random matrix is a
1-disjunct matrix with positive probability. A well-known theorem stated in [13]
asserts that to find the defective items, it is sufficient to create a disjunct
matrix. More precisely, in the main theorem, we prove that there is a
$t \times m$ matrix $C$ such that for every $i \ne j$, there are two distinct
numbers $1 \le l, l' \le t$ such that $c_{li} = 1$, $c_{lj} = 0$ and
$c_{l'i} = 0$, $c_{l'j} = 1$. So if $H_i$ is the defective subgraph, then for
every non-defective subgraph $H_j$, there exists a test $F_l$ such that
$E(F_l) \cap E(H_j) = \emptyset$ and $E(F_l) \cap E(H_i) \ne \emptyset$. So the
test $F_l$ distinguishes between the defective subgraph $H_i$ and the
non-defective subgraph $H_j$. Therefore, using this matrix we can identify every
non-defective subgraph isomorphic to $H$, and hence the defective one.
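
The decoding step described above can be sketched as follows. All names (`random_test`, `outcome`, `build_matrix`, `decode`) and the toy instance are assumptions for illustration. The example uses hand-picked tests so that the behaviour is deterministic; `random_test` only shows how a test would be sampled in the random construction.

```python
import random


def random_test(vertices, p, rng):
    """Sample a test: keep each vertex independently with probability p."""
    return {v for v in vertices if rng.random() < p}


def outcome(test, edges):
    """1 if the induced subgraph on `test` contains an edge from `edges`, else 0."""
    return int(any(u in test and v in test for u, v in edges))


def build_matrix(tests, candidates):
    """Row l, column j is 1 iff test l shares an edge with candidate H_j."""
    return [[outcome(test, edges) for edges in candidates] for test in tests]


def decode(matrix, results):
    """Indices of candidates whose column agrees with the observed outcomes."""
    m = len(matrix[0])
    return [j for j in range(m)
            if all(row[j] == r for row, r in zip(matrix, results))]


# Hypothetical instance: G is the path 0-1-2-3 and H is a single edge.
candidates = [[(0, 1)], [(1, 2)], [(2, 3)]]
tests = [{0, 1}, {1, 2}, {2, 3}]     # fixed tests, chosen by hand
C = build_matrix(tests, candidates)  # here C is the 3 x 3 identity matrix
hidden = candidates[1]               # the defective subgraph, unknown to the decoder
results = [outcome(test, hidden) for test in tests]
found = decode(C, results)
```

Since the outcome vector equals the column of the defective subgraph, and the columns of a 1-disjunct matrix are pairwise distinct, the decoder returns exactly one index.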



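Theorem 1. Let $H$ be the defective subgraph with $|E(H)| = k$ and
$\Delta(H) = \Delta$. One can find the subgraph $H$ with $t$ non-adaptive tests,
where

$$ t = 1 + \left\lceil \frac{\ln(4e\, r_G(H)) + \ln m}{\ln \frac{1}{1 - P_{k,\Delta}}} \right\rceil, $$

$$ P_{k,\Delta} = \frac{1}{2k\Delta}\left(1 - \frac{1}{2\Delta}\right)^{2\Delta - 1}\left(1 - \sqrt{\frac{1}{2k\Delta}}\left(1 - \frac{1}{2\Delta}\right)^{\Delta - 1}\right)^{2\Delta}, $$

and $e$ is the base of the natural logarithm.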
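
To get a feeling for the size of this bound, it can be evaluated numerically. The sketch below assumes the reading $P_{k,\Delta} = p^2 (1 - p)^{2\Delta}(1 - \varepsilon)$ with $\varepsilon = 1/(2\Delta)$ and $p = \sqrt{\varepsilon / k}\,(1 - \varepsilon)^{\Delta - 1}$, which is how the quantity arises in the proofs below; the function names and the sample parameters are hypothetical.

```python
import math


def p_k_delta(k, delta):
    """P_{k,Delta} under the assumption eps = 1 / (2 * Delta)."""
    eps = 1 / (2 * delta)
    p = math.sqrt(eps / k) * (1 - eps) ** (delta - 1)
    return p ** 2 * (1 - p) ** (2 * delta) * (1 - eps)


def number_of_tests(k, delta, m, r):
    """t = 1 + ceil((ln(4 e r) + ln m) / ln(1 / (1 - P_{k,Delta})))."""
    big_p = p_k_delta(k, delta)
    numerator = math.log(4 * math.e * r) + math.log(m)
    return 1 + math.ceil(numerator / math.log(1 / (1 - big_p)))


# Hypothetical parameters: H is a single edge (k = 1, Delta = 1),
# with m = 10 copies of H in G and r_G(H) = 3.
t_example = number_of_tests(k=1, delta=1, m=10, r=3)
```

The bound grows only logarithmically in $m$ and $r_G(H)$, while the dependence on $k$ and $\Delta$ enters through $P_{k,\Delta}$.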



To prove the main theorem, we need some supportive results. 

Lemma 1. Let $H$ be a graph with $n$ vertices, $k$ edges, and maximum degree
$\Delta$. Pick, randomly and independently, each vertex of $H$ with probability
$p$, where $p = \sqrt{\frac{\varepsilon}{k}\left(1 - \frac{\varepsilon}{k}\right)^{2(\Delta - 1)}}$.
If $F$ is the set of all chosen vertices, then $H[F]$ is an independent
set with probability at least $1 - \varepsilon$.

To prove this lemma, we need the high-probability variation of the Lovász Local
Lemma.

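Lemma B. [8] Let $B_1, B_2, \ldots, B_k$ be events in a probability space.
Suppose that each event $B_i$ is independent of all the events $B_j$ but at most
$d$. For $1 \le i \le k$ and $0 < \varepsilon < 1$, if
$\Pr(B_i) \le \frac{\varepsilon}{k}\left(1 - \frac{\varepsilon}{k}\right)^{d}$,
then $\Pr\Big(\bigcap_{i=1}^{k} \overline{B_i}\Big) \ge 1 - \varepsilon$.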

Proof of Lemma 1. Let $E(H) = \{e_1, e_2, \ldots, e_k\}$. For $1 \le i \le k$,
we define $B_i$ to be the event that $e_i \in E(H[F])$, so $\Pr(B_i) = p^2$.
Since vertices are chosen randomly and independently, the event $B_i$ is
independent of the event $B_j$ if and only if the edges $e_i$ and $e_j$ have no
common vertex. So the maximum degree of the dependency graph is at most
$2(\Delta - 1)$. Since
$p^2 \le \frac{\varepsilon}{k}\left(1 - \frac{\varepsilon}{k}\right)^{2(\Delta - 1)}$,
by Lemma B, $\Pr\Big(\bigcap_{i=1}^{k} \overline{B_i}\Big) \ge 1 - \varepsilon$.
Hence, $H[F]$ is an independent set with probability at least
$1 - \varepsilon$. ■
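
On a small graph, the conclusion of Lemma 1 can be confirmed exactly by summing over all vertex subsets. The sketch below assumes the value $p = \sqrt{\varepsilon / k}\,(1 - \varepsilon / k)^{\Delta - 1}$; the function name, the choice of the 5-cycle, and $\varepsilon = 0.3$ are hypothetical.

```python
import math
from itertools import product


def prob_no_edge(n, edges, p):
    """Exact probability that a random subset of {0, ..., n-1}, keeping each
    vertex independently with probability p, spans none of the given edges."""
    total = 0.0
    for mask in product([0, 1], repeat=n):
        if any(mask[u] and mask[v] for u, v in edges):
            continue  # this subset contains both endpoints of some edge
        chosen = sum(mask)
        total += p ** chosen * (1 - p) ** (n - chosen)
    return total


# Hypothetical example: H is the 5-cycle, so k = 5 and Delta = 2.
n = 5
cycle_edges = [(i, (i + 1) % 5) for i in range(5)]
k, delta, eps = len(cycle_edges), 2, 0.3
p = math.sqrt(eps / k) * (1 - eps / k) ** (delta - 1)
lemma_holds = prob_no_edge(n, cycle_edges, p) >= 1 - eps
```

The enumeration takes $2^n$ steps, so it is meant only as a sanity check for small $n$.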

We now bound the probability that a test $F_l$ distinguishes between a pair of
subgraphs $H_i$ and $H_j$. The analysis is split into several lemmas, as follows.

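Lemma 2. If $V(H_i) = V(H_j)$ and $|E(H_i) \setminus E(H_j)| = 1$, then

$$ \Pr\big(E(F_l \cap H_i) \ne \emptyset,\ E(F_l \cap H_j) = \emptyset\big) \ge p^2 (1 - p)^{2\Delta}(1 - \varepsilon), $$

where $p = \sqrt{\frac{\varepsilon}{k}}\,(1 - \varepsilon)^{\Delta - 1}$.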

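Proof. Let $e = \{u, v\} \in E(H_i) \setminus E(H_j)$. Consider the induced
subgraph $H'$ of $H_j$, where
$V(H') = V(H_j) \setminus \big(\{u\} \cup \{v\} \cup N_{H_j}(u) \cup N_{H_j}(v)\big)$.
Note that $E(F_l \cap H_i) \ne \emptyset$ and $E(F_l \cap H_j) = \emptyset$ if
and only if $H_j \cap F_l$ is an independent set of $H_j$ and
$u, v \in \mathcal{F}_l$. Also, one can see that $u, v \in \mathcal{F}_l$ and
$H_j[\mathcal{F}_l]$ is an independent set if and only if the following events
hold

1. $u, v \in \mathcal{F}_l$,

2. $N_{H_j}(u) \cap \mathcal{F}_l = \emptyset$ and $N_{H_j}(v) \cap \mathcal{F}_l = \emptyset$,

3. $H'[\mathcal{F}_l]$ is an independent set.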

It is straightforward to check that the aforementioned events are independent.
Also, one can see that the event $u, v \in \mathcal{F}_l$ occurs with
probability $p^2$. Since $|N_{H_j}(u) \cup N_{H_j}(v)| \le 2\Delta$,

$$ \Pr\big(N_{H_j}(u) \cap \mathcal{F}_l = \emptyset \text{ and } N_{H_j}(v) \cap \mathcal{F}_l = \emptyset\big) = \Pr\big(\mathcal{F}_l \cap \big((N_{H_j}(u) \cup N_{H_j}(v)) \setminus \{u, v\}\big) = \emptyset\big) \ge (1 - p)^{2\Delta}. $$

Set $|E(H')| = k'$. If $k' = 0$, then $F_l \cap H'$ has no edges, so
$\Pr(E(F_l \cap H') = \emptyset) = 1$. Suppose that $k' \ge 1$. Since
$k \ge k'$, we have
$p^2 = \frac{\varepsilon}{k}(1 - \varepsilon)^{2\Delta - 2} \le \frac{\varepsilon}{k'}\left(1 - \frac{\varepsilon}{k'}\right)^{2\Delta - 2}$.
Each vertex of the induced subgraph $H'$ is chosen with probability $p$. So by
Lemma 1, the induced subgraph on $\mathcal{F}_l \cap V(H')$ is an independent set
with probability at least $1 - \varepsilon$. In other words,
$\Pr(E(F_l \cap H') = \emptyset) \ge 1 - \varepsilon$. Since the events are
independent, we have

$$ \Pr\big(E(F_l \cap H_i) \ne \emptyset,\ E(F_l \cap H_j) = \emptyset\big) \ge p^2 (1 - p)^{2\Delta}(1 - \varepsilon), $$

as desired. ■

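Lemma 3. If $|V(H_i) \setminus V(H_j)| = 1$, then

$$ \Pr\big(E(F_l \cap H_i) \ne \emptyset,\ E(F_l \cap H_j) = \emptyset\big) \ge p^2 (1 - p)^{\Delta}(1 - \varepsilon), $$

where $p = \sqrt{\frac{\varepsilon}{k}}\,(1 - \varepsilon)^{\Delta - 1}$.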

Proof. Since $H$ has no isolated vertex, there exists at least one edge
$e = \{u, v\} \in E(H_i) \setminus E(H_j)$. Let $v \in V(H_i) \cap V(H_j)$ and
$u \in V(H_i) \setminus V(H_j)$. Suppose that $H'$ is an induced subgraph of
$H_j$, where $V(H') = V(H_j) \setminus \big(\{v\} \cup N_{H_j}(v)\big)$. Set
$|E(H')| = k'$. Similar to the proof of Lemma 2,
$E(F_l \cap H_i) \ne \emptyset$ and $E(F_l \cap H_j) = \emptyset$ if and only
if the following independent events hold

1. $u, v \in \mathcal{F}_l$,

2. $N_{H_j}(v) \cap \mathcal{F}_l = \emptyset$,

3. $H'[\mathcal{F}_l]$ is an independent set.

Since $|N_{H_j}(v)| \le \Delta$, the probability that
$N_{H_j}(v) \cap \mathcal{F}_l = \emptyset$ is at least $(1 - p)^{\Delta}$.
The rest of the proof is similar to that of Lemma 2, so

$$ \Pr\big(E(F_l \cap H_i) \ne \emptyset,\ E(F_l \cap H_j) = \emptyset\big) \ge p^2 (1 - p)^{\Delta}(1 - \varepsilon), $$

as desired. ■

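Lemma 4. If the induced subgraph on $V(H_i) \setminus V(H_j)$ has at least one
edge, then

$$ \Pr\big(E(F_l \cap H_i) \ne \emptyset,\ E(F_l \cap H_j) = \emptyset\big) \ge p^2 (1 - \varepsilon), $$

where $p = \sqrt{\frac{\varepsilon}{k}}\,(1 - \varepsilon)^{\Delta - 1}$.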

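Proof. Let $e = \{u, v\} \in E(H_i) \setminus E(H_j)$ be an edge with both
endpoints in $V(H_i) \setminus V(H_j)$; such an edge exists by assumption. If
the following independent events hold

1. $u, v \in \mathcal{F}_l$,

2. $H_j[\mathcal{F}_l]$ is an independent set,

then $E(F_l \cap H_i) \ne \emptyset$ and $E(F_l \cap H_j) = \emptyset$. Since
$p^2 = \frac{\varepsilon}{k}(1 - \varepsilon)^{2\Delta - 2} \le \frac{\varepsilon}{k}\left(1 - \frac{\varepsilon}{k}\right)^{2\Delta - 2}$,
by Lemma 1, $\Pr(E(F_l \cap H_j) = \emptyset) \ge 1 - \varepsilon$. Also, one can
see that

$$ \Pr\big(E(F_l \cap H_i) \ne \emptyset\big) \ge \Pr\big(e \in E(F_l)\big) = \Pr\big(u, v \in \mathcal{F}_l\big) = p^2. $$

Consequently,
$\Pr\big(E(F_l \cap H_i) \ne \emptyset,\ E(F_l \cap H_j) = \emptyset\big) \ge p^2 (1 - \varepsilon)$. ■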

In the next theorem, and in view of the previous lemmas, we show that the
probability of distinguishing between $H_i$ and $H_j$ attains its minimum value
whenever $V(H_i) = V(H_j)$ and $|E(H_i) \setminus E(H_j)| = 1$.

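Theorem 2. Let $|E(H)| = k$ and $\Delta(H) = \Delta$. For every
$1 \le i \ne j \le m$ and $1 \le l \le t$, we have

$$ \Pr\big(\overline{A^l_{i,j}}\big) \ge p^2 (1 - p)^{2\Delta}(1 - \varepsilon), \qquad (1) $$

where $p = \sqrt{\frac{\varepsilon}{k}}\,(1 - \varepsilon)^{\Delta - 1}$.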

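Proof. Let $E(H_i) \cap E(H_j) = \{f_1, f_2, \ldots, f_r\}$ and
$E(H_i) \setminus E(H_j) = \{e_1, e_2, \ldots, e_{k-r}\}$.

The event $\overline{A^l_{i,j}}$ occurs if and only if
$E(F_l \cap H_i) \ne \emptyset$ and $E(F_l \cap H_j) = \emptyset$. It is easy
to check that, for every $1 \le q \le k - r$,

$$ \Pr\big(E(F_l \cap H_i) \ne \emptyset,\ E(F_l \cap H_j) = \emptyset\big) \ge \Pr\big(e_q \in E(F_l \cap H_i),\ E(F_l \cap H_j) = \emptyset\big). $$

So we need to consider the following three cases.

Case 1: $V(H_i) = V(H_j)$ and $|E(H_i) \setminus E(H_j)| = 1$.
By Lemma 2, it is clear that
$\Pr\big(\overline{A^l_{i,j}}\big) \ge p^2 (1 - p)^{2\Delta}(1 - \varepsilon)$.

Case 2: $|V(H_i) \setminus V(H_j)| = 1$.
By Lemma 3, we have
$\Pr\big(\overline{A^l_{i,j}}\big) \ge p^2 (1 - p)^{\Delta}(1 - \varepsilon) \ge p^2 (1 - p)^{2\Delta}(1 - \varepsilon)$.

Case 3: The induced subgraph on $V(H_i) \setminus V(H_j)$ has at least one edge.
By Lemma 4,
$\Pr\big(\overline{A^l_{i,j}}\big) \ge p^2 (1 - \varepsilon) \ge p^2 (1 - p)^{2\Delta}(1 - \varepsilon)$.

So for every $1 \le i \ne j \le m$ and $1 \le l \le t$,
$\Pr\big(\overline{A^l_{i,j}}\big) \ge p^2 (1 - p)^{2\Delta}(1 - \varepsilon)$. ■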
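
The bound (1) can be compared with the exact distinguishing probability on a tiny instance of Case 1. In the sketch below, the two paths, the value $\varepsilon = 1/(2\Delta)$, and the name `prob_distinguish` are assumptions for illustration; only the vertices of $H_i$ and $H_j$ are enumerated, since the other vertices of $G$ do not affect the event.

```python
import math
from itertools import product


def prob_distinguish(vertices, edges_i, edges_j, p):
    """Exact Pr[a random test contains an edge of H_i and no edge of H_j]."""
    vertices = list(vertices)
    total = 0.0
    for mask in product([False, True], repeat=len(vertices)):
        chosen = {v for v, keep in zip(vertices, mask) if keep}
        hits_i = any(u in chosen and v in chosen for u, v in edges_i)
        hits_j = any(u in chosen and v in chosen for u, v in edges_j)
        if hits_i and not hits_j:
            total += p ** len(chosen) * (1 - p) ** (len(vertices) - len(chosen))
    return total


# Hypothetical Case 1 instance: two paths on the vertex set {0, 1, 2, 3}
# that differ in exactly one edge, so k = 3 and Delta = 2.
H_i = [(0, 1), (1, 2), (2, 3)]
H_j = [(1, 2), (2, 3), (3, 0)]
k, delta = 3, 2
eps = 1 / (2 * delta)
p = math.sqrt(eps / k) * (1 - eps) ** (delta - 1)
bound = p ** 2 * (1 - p) ** (2 * delta) * (1 - eps)
exact = prob_distinguish(range(4), H_i, H_j, p)
```

In this instance the only distinguishing test is the one choosing exactly the vertices 0 and 1, so the exact probability is $p^2 (1 - p)^2$, which is larger than the bound, as expected.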

To prove the main theorem, we present an upper bound for the probability that
the bad event $A_{i,j}$ occurs, for every $1 \le i \ne j \le m$.

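Theorem 3. Let $|E(H)| = k$ and $\Delta(H) = \Delta$. For every
$1 \le i \ne j \le m$, we have

$$ \Pr(A_{i,j}) \le (1 - P_{k,\Delta})^t, \qquad (2) $$

where
$P_{k,\Delta} = \frac{1}{2k\Delta}\left(1 - \frac{1}{2\Delta}\right)^{2\Delta - 1}\left(1 - \sqrt{\frac{1}{2k\Delta}}\left(1 - \frac{1}{2\Delta}\right)^{\Delta - 1}\right)^{2\Delta}$.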

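Proof. Since $\mathcal{F}_1, \mathcal{F}_2, \ldots, \mathcal{F}_t \subseteq V(G)$
are chosen randomly and independently, the events
$A^1_{i,j}, \ldots, A^t_{i,j}$ are mutually independent. So

$$ \Pr(A_{i,j}) = \big(\Pr(A^l_{i,j})\big)^t. $$

By the definition of $A^l_{i,j}$ and Theorem 2, we have

$$ \Pr\big(\overline{A^l_{i,j}}\big) = \Pr\big(E(F_l \cap H_i) \ne \emptyset,\ E(F_l \cap H_j) = \emptyset\big) \ge p^2 (1 - p)^{2\Delta}(1 - \varepsilon). $$

We set $\varepsilon = \frac{1}{2\Delta}$. So
$\Pr\big(\overline{A^l_{i,j}}\big) \ge P_{k,\Delta}$, where

$$ P_{k,\Delta} = \frac{1}{2k\Delta}\left(1 - \frac{1}{2\Delta}\right)^{2\Delta - 1}\left(1 - \sqrt{\frac{1}{2k\Delta}}\left(1 - \frac{1}{2\Delta}\right)^{\Delta - 1}\right)^{2\Delta}. $$

Therefore,
$\Pr(A_{i,j}) = \big[\Pr(A^l_{i,j})\big]^t \le (1 - P_{k,\Delta})^t$. ■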

Now, we are ready to prove the main theorem. 

Proof of Theorem 1. By Theorem 3, for every $1 \le i \ne j \le m$,
$\Pr(A_{i,j}) \le (1 - P_{k,\Delta})^t$.

Now we prove that if
$t \ge \frac{\ln(4e\, r_G(H)) + \ln m}{\ln \frac{1}{1 - P_{k,\Delta}}}$, then by
the Lovász Local Lemma, with positive probability no event $A_{i,j}$ occurs.

Construct the dependency graph whose vertices are the events $A_{i,j}$, where
$1 \le i \ne j \le m$. Two events $A_{i,j}$ and $A_{i',j'}$ are adjacent if and
only if
$\big(V(H_i) \cup V(H_j)\big) \cap \big(V(H_{i'}) \cup V(H_{j'})\big) \ne \emptyset$.
Recall that $r_G(H) = \max_i r_G(H, H_i)$, where $r_G(H, H_i)$ is the number of
subgraphs isomorphic to $H$ whose intersection with $H_i$ is nonempty. It is
straightforward to verify that the maximum degree in the dependency graph is
at most $4 r_G(H)(m - 1)$. Hence, if



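$$ t \ge \frac{\ln(4e\, r_G(H)) + \ln m}{\ln \frac{1}{1 - P_{k,\Delta}}}, $$

then $e (1 - P_{k,\Delta})^t \big(4 r_G(H)(m - 1) + 1\big) \le 1$, and by the
Lovász Local Lemma

$$ \Pr\Big(\bigcap_{i,j} \overline{A_{i,j}}\Big) > 0. $$

Therefore, if
$t = 1 + \left\lceil \frac{\ln(4e\, r_G(H)) + \ln m}{\ln \frac{1}{1 - P_{k,\Delta}}} \right\rceil$,
then with positive probability no event $A_{i,j}$ occurs. So the random matrix
$C$ is a 1-disjunct matrix with positive probability. ■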